Performance database: capturing data for optimizing distributed streaming workflows
Similar resources
Performance database: capturing data for optimizing distributed streaming workflows.
The performance database (PDB) stores performance-related data gathered during workflow enactment. We argue that, by carefully understanding and manipulating these data, we can improve efficiency when enacting workflows. This paper describes the rationale behind the PDB, and proposes a systematic way to implement it. The prototype is built as part of the Advanced Data Mining and Integration Res...
Performance Database: Capturing Data for Optimising Distributed Streaming Workflows
It is evident that data-intensive research is transforming the computing landscape, as recognised in “The Fourth Paradigm” [1]. Due to the scale, complexity and heterogeneity of data gathered in scientific experiments, we cannot naively dump the data into computing resources and hope to extract useful information and knowledge through exhaustive and unstructured computations. To survive t...
Separating indexes from data: a distributed scheme for secure database outsourcing
Database outsourcing aims to relieve organizations of the burden of database management. Since data is a critical asset of organizations, its privacy must be protected from outside adversaries and untrusted servers. In this paper, we present a distributed scheme based on storing shares of data on different servers and separating indexes from data on a distinct server. Shamir...
Fuzzy Data Envelopment Analysis for Classification of Streaming Data
The classification of fuzzy uncertain data is considered one of the most challenging issues in data analysis. In spite of the significance of fuzzy data in mathematical programming, the development of the analytical methods of fuzzy data is slow. Therefore, the current study proposes a new fuzzy data classification method based on fuzzy data envelopment analysis (DEA) which can handle strea...
Parallelizing XML data-streaming workflows via MapReduce
In prior work it has been shown that the design of scientific workflows can benefit from a collection-oriented modeling paradigm which views scientific workflows as pipelines of XML stream processors. In this paper, we present approaches for exploiting data parallelism in XML processing pipelines through novel compilation strategies to the Map-Reduce framework. Pipelines in our approach consist...
Journal
Journal title: Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
Year: 2011
ISSN: 1364-503X,1471-2962
DOI: 10.1098/rsta.2011.0134